Neural Embedding Allocation: Distributed Representations of Topic Models
Abstract
We propose a method that uses neural embeddings to improve the performance of any given LDA-style topic model. Our method, called neural embedding allocation (NEA), deconstructs topic models (LDA or otherwise) into interpretable vector-space embeddings of words, topics, documents, authors, and so on, by learning neural embeddings to mimic the topic model. We demonstrate that NEA improves coherence scores of the original topic model by smoothing out noisy topics when the number of topics is large. Furthermore, we show NEA's effectiveness and generality in deconstructing and smoothing LDA, author-topic models, and the recent mixed membership skip-gram topic model, and in achieving better performance with the embeddings compared to several state-of-the-art models.
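As a concrete illustration of the deconstruction idea, here is a minimal sketch (not the authors' implementation) of learning word and topic embeddings that mimic a fitted LDA topic-word matrix: each topic's word distribution is reconstructed as a softmax over the vocabulary, and the embeddings are trained to minimize cross-entropy against the original distribution. The variable names, dimensions, and the plain cross-entropy objective are illustrative assumptions.

```python
# Minimal sketch: learn word/topic embeddings that mimic a fitted LDA
# topic-word matrix phi (K topics x V words). Illustrative only; the
# names and the cross-entropy objective are assumptions, not the
# paper's exact training procedure.
import torch

K, V, D = 50, 2000, 100                      # topics, vocabulary, embedding dim
phi = torch.rand(K, V)
phi = phi / phi.sum(dim=1, keepdim=True)     # stand-in for a fitted LDA model

topic_vecs = torch.randn(K, D, requires_grad=True)   # one vector per topic
word_vecs = torch.randn(V, D, requires_grad=True)    # one vector per word

opt = torch.optim.Adam([topic_vecs, word_vecs], lr=0.01)
for step in range(500):
    # Reconstruct each topic as a softmax over the vocabulary.
    log_recon = torch.log_softmax(topic_vecs @ word_vecs.T, dim=1)   # (K, V)
    # Cross-entropy between the original topics and their reconstructions.
    loss = -(phi * log_recon).sum(dim=1).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# softmax(topic_vecs @ word_vecs.T) is now a smoothed, embedding-based
# stand-in for phi, with words and topics in one shared vector space.
```

Because every topic is reconstructed through the same shared word vectors, rare or noisy topics borrow statistical strength from the rest of the model, which is one intuition for the smoothing effect described in the abstract.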
Similar resources
Concurrent Inference of Topic Models and Distributed Vector Representations
Topic modeling techniques have been widely used to uncover dominant themes hidden inside an unstructured document collection. Though these techniques first originated in the probabilistic analysis of word distributions, many deep learning approaches have been adopted recently. In this paper, we propose a novel neural-network-based architecture that produces distributed representations of topics ...
Discriminative Neural Topic Models
We propose a neural network based approach for learning topics from text and image datasets. The model makes no assumptions about the conditional distribution of the observed features given the latent topics. This allows us to perform topic modelling efficiently using sentences of documents and patches of images as observed features, rather than limiting ourselves to words. Moreover, the propos...
Incorporating Topic Priors into Distributed Word Representations
Representing words as continuous vectors enables the quantification of semantic relationships between words by vector operations, and has thereby attracted much attention recently. This paper proposes an approach that combines continuous word representations and topic modeling by encoding words based on their topic distributions in the hierarchical softmax, so as to introduce the prior semantic relevance i...
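The blurb above names a concrete mechanism, a topic-informed hierarchical softmax. Here is a minimal two-level sketch under stated assumptions: each word is grouped under its dominant topic, and p(w | context) factorizes as p(topic(w) | context) · p(w | topic(w), context). The grouping rule, the two-level tree, and all variable names are illustrative stand-ins for the paper's actual encoding.

```python
# Minimal sketch of a topic-informed two-level softmax: each word is
# grouped under its dominant topic, and the output layer factorizes as
# p(w | ctx) = p(topic(w) | ctx) * p(w | topic(w), ctx). Illustrative
# stand-in for the paper's hierarchical softmax, not its implementation.
import torch

K, V, D = 20, 1000, 64                       # topics, vocabulary, embedding dim
phi = torch.rand(K, V)                       # stand-in topic-word weights
word_topic = phi.argmax(dim=0)               # dominant topic for each word

ctx = torch.randn(D)                         # a context (hidden-layer) vector
topic_out = torch.randn(K, D) * 0.01         # output vectors for topic nodes
word_out = torch.randn(V, D) * 0.01          # output vectors for leaf words

def log_prob(word: int) -> torch.Tensor:
    """log p(word | ctx) under the two-level, topic-grouped softmax."""
    t = int(word_topic[word])
    # Level 1: probability of the word's topic group.
    log_p_topic = torch.log_softmax(topic_out @ ctx, dim=0)[t]
    # Level 2: probability of the word within its group only, so the
    # normalization runs over one topic's words, not the whole vocabulary.
    group = (word_topic == t).nonzero(as_tuple=True)[0]
    pos = int((group == word).nonzero(as_tuple=True)[0])
    log_p_word = torch.log_softmax(word_out[group] @ ctx, dim=0)[pos]
    return log_p_topic + log_p_word

print(log_prob(42))                          # one word's log-likelihood
```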
Distributed Algorithms for Topic Models
We describe distributed algorithms for two widely-used topic models, namely the Latent Dirichlet Allocation (LDA) model, and the Hierarchical Dirichlet Process (HDP) model. In our distributed algorithms the data is partitioned across separate processors and inference is done in a parallel, distributed fashion. We propose two distributed algorithms for LDA. The first algorithm is a straightforwar...
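To make the partition-and-merge pattern concrete, here is a minimal sketch in the spirit of approximate distributed LDA: documents are split across workers, each worker runs a collapsed Gibbs sweep against its own copy of the topic-word counts, and the copies are merged after the sweep. A serial loop stands in for real parallelism, and all names and hyperparameters are illustrative assumptions rather than the paper's code.

```python
# Sketch of approximate distributed LDA: split documents across workers,
# Gibbs-sample each partition against a worker-local copy of the
# topic-word counts, then merge the per-worker deltas after every sweep.
import numpy as np

rng = np.random.default_rng(0)
K, V, alpha, beta = 10, 500, 0.1, 0.01       # topics, vocab, Dirichlet priors
docs = [rng.integers(V, size=rng.integers(20, 60)) for _ in range(200)]
z = [rng.integers(K, size=len(d)) for d in docs]        # topic assignments

def topic_word_counts(doc_ids):
    """Topic-word count matrix restricted to the given documents."""
    nkw = np.zeros((K, V))
    for d in doc_ids:
        for w, t in zip(docs[d], z[d]):
            nkw[t, w] += 1
    return nkw

parts = np.array_split(np.arange(len(docs)), 4)         # 4 "workers"
global_nkw = topic_word_counts(range(len(docs)))

for sweep in range(5):
    deltas = []
    for part in parts:                                  # conceptually parallel
        local_nkw = global_nkw.copy()                   # worker-local copy
        for d in part:
            ndk = np.bincount(z[d], minlength=K)        # doc-topic counts
            for i, w in enumerate(docs[d]):
                t = z[d][i]
                local_nkw[t, w] -= 1; ndk[t] -= 1       # remove old assignment
                p = (ndk + alpha) * (local_nkw[:, w] + beta) \
                    / (local_nkw.sum(axis=1) + V * beta)
                t = rng.choice(K, p=p / p.sum())        # collapsed Gibbs draw
                z[d][i] = t
                local_nkw[t, w] += 1; ndk[t] += 1
        deltas.append(local_nkw - global_nkw)           # worker's net update
    global_nkw += sum(deltas)                           # merge after the sweep
```

Merging the per-worker deltas only at the end of each sweep is what makes this approximate: within a sweep, workers sample against slightly stale counts, but because each token belongs to exactly one worker, the merged counts are exact again at every synchronization point.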
Journal
Title: Computational Linguistics
Year: 2022
ISSN: 1530-9312, 0891-2017
DOI: https://doi.org/10.1162/coli_a_00457